Dependability Prediction of High Availability OSCAR Cluster Server

Authors

  • Chokchai Leangsuksun
  • Lixin Shen
  • Tong Liu
  • Hertong Song
  • Stephen L. Scott
Abstract

High availability (HA) computing has recently gained much attention, especially in enterprise and mission-critical systems. HA is now a necessity rather than a luxury feature. Thus, together with the open source community, we are in the process of adding HA features to Open Source Cluster Application Resources (OSCAR), a widely adopted Linux PC cluster system. Server redundancy is the initial key aspect of the next-generation HA OSCAR cluster system. In this paper, we introduce an HA server for the OSCAR cluster system. Its architecture and mechanism are discussed, and we then model and predict the dependability of the system using a Petri net-based model, the Stochastic Reward Net (SRN). The reliability and instantaneous availability of the system are presented as results.

Similar resources

The Modeling and Dependability Analysis of High Availability OSCAR Cluster System

OSCAR is widely used for building and maintaining high-performance parallel computing systems. In many cases, the high availability requirement becomes as critical as high performance. In this paper, the current OSCAR cluster system is introduced. Some high availability considerations are discussed and the high availability OSCAR cluster system is presented. Continuous Time Markov Chain models are b...

Availability Prediction and Modeling of High Availability OSCAR Cluster

Since the initial introduction of Open Source Cluster Application Resources (OSCAR), this software package has been a well-accepted choice for building high performance computing systems. As it continues to be applied to mission-critical environments, high availability (HA) features need to be included in the OSCAR cluster. In this paper, we provide an HA solution for the OSCAR cluster. ...

Highly Reliable Linux HPC Clusters: Self-Awareness Approach

Current solutions for fault-tolerance in HPC systems focus on dealing with the result of a failure. However, most are unable to handle runtime system configuration changes caused by transient failures and require a complete restart of the entire machine. The recently released HA-OSCAR software stack is one such effort making inroads here. This paper discusses detailed solutions for the high-ava...

Cluster Application Resources (OSCAR): design, implementation and interest for the [computer] scientific community

The Open Source Cluster Application Resources (OSCAR) project is the founding working group of the Open Cluster Group (OCG). The OCG is an informal group of people dedicated to making cluster computing practical for high performance computing and more recently, clustering in general (high availability, diskless). OSCAR is a package that makes it easy to build clusters for high performance compu...

A Reconfigurable High Availability Infrastructure in Cluster for Grid

The paper presents the implementation and analysis of a service-based reconfigurable High Availability infrastructure of a cluster system for Grid. Based on the service notion, the High Availability infrastructure is constructed for mission-critical applications on Grid. The application high availability service is responsible for registered applications’ high availability. The Service Manager is in c...

Publication date: 2003